This document is an example Cygnus Report that can be generated at the last step of a Cygnus data analysis. Each plot in this file is generated when the corresponding R Markdown file is knitted to HTML, so the same file can be reused with another Cygnus dataset to produce a similar report. All code is hidden in the final HTML file to keep the report simple.

Object Summary

## CygnusData Object: 
## Number of EVs:  107871 
## Number of markers:  12 
## Present Metadata:  stage image

Data Exploration

This section includes a heatmap of average marker expression and the distribution of each marker. The cutoff for binary conversion can be chosen from these distributions.
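To make the binary conversion step concrete, here is a minimal sketch in plain Python. It is not the Cygnus implementation: the function name `binarize`, the expression values, and the cutoff of 1.0 are all hypothetical and chosen only for illustration.

```python
def binarize(values, cutoff):
    """Return 1 for each value strictly above the cutoff, otherwise 0."""
    return [1 if v > cutoff else 0 for v in values]


# Hypothetical expression values of one marker across six EVs (not report data).
marker_expression = [0.1, 2.4, 0.0, 1.8, 0.3, 3.1]

# Assumed cutoff; in practice it would be read off the marker's distribution plot.
cutoff = 1.0

binary_marker = binarize(marker_expression, cutoff)
positive_fraction = sum(binary_marker) / len(binary_marker)

print(binary_marker)
print(positive_fraction)
```

A cutoff placed in the valley between a background peak and a positive peak of the distribution is a common choice, but the right value depends on the marker.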

Marker co-localization analysis

Cygnus offers two methods of marker co-expression analysis. The first requires binary conversion and looks only at the co-occurrence of different marker combinations. Its benefit is that it allows statistical analysis of all possible combinations of markers.
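The co-occurrence idea can be sketched as counting how many EVs carry each combination of positive markers after binary conversion. This is an illustrative sketch only, not the Cygnus code; the marker names `A`, `B`, `C`, the helper `combination_counts`, and the five example rows are hypothetical.

```python
from collections import Counter


def combination_counts(binary_rows, markers):
    """Count how many rows (EVs) are positive for each exact marker combination."""
    counts = Counter()
    for row in binary_rows:
        # Keep the names of the markers flagged as present in this row.
        present = tuple(m for m, flag in zip(markers, row) if flag)
        counts[present] += 1
    return counts


# Hypothetical binary matrix: one row per EV, one column per marker.
markers = ["A", "B", "C"]
binary_rows = [
    (1, 1, 0),
    (1, 0, 0),
    (1, 1, 0),
    (0, 0, 1),
    (0, 0, 0),
]

counts = combination_counts(binary_rows, markers)
for combination, n in sorted(counts.items()):
    label = "+".join(combination) if combination else "(none)"
    print(label, n)
```

With 12 markers there are 2^12 = 4096 possible combinations, so most real tables of this kind are sparse.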

Marker correlation analysis

The following plot shows the Pearson correlation coefficient between the expression levels of each pair of markers.
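For reference, the Pearson coefficient for one pair of markers can be computed as below. This is a plain-Python sketch with hypothetical expression vectors, not the routine the report itself uses.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations; the 1/(n-1) factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression values of two markers across four EVs.
marker_a = [1.0, 2.0, 3.0, 4.0]
marker_b = [2.0, 4.0, 6.0, 8.0]

r = pearson(marker_a, marker_b)
print(r)
```

The coefficient is undefined when either marker is constant across all EVs, because its variance is zero; the sketch does not guard against that case.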

Dimensionality Reduction

## Read the 107871 x 12 data matrix successfully!
## Using no_dims = 3, perplexity = 30.000000, and theta = 0.500000
## Computing input similarities...
## Building tree...
##  - point 10000 of 107871
##  - point 20000 of 107871
##  - point 30000 of 107871
##  - point 40000 of 107871
##  - point 50000 of 107871
##  - point 60000 of 107871
##  - point 70000 of 107871
##  - point 80000 of 107871
##  - point 90000 of 107871
##  - point 100000 of 107871
## Done in 18.05 seconds (sparsity = 0.001169)!
## Learning embedding...
## Iteration 50: error is 125.858025 (50 iterations in 32.05 seconds)
## Iteration 100: error is 125.858025 (50 iterations in 45.65 seconds)
## Iteration 150: error is 125.858003 (50 iterations in 43.03 seconds)
## Iteration 200: error is 125.580744 (50 iterations in 39.80 seconds)
## Iteration 250: error is 112.690441 (50 iterations in 32.40 seconds)
## Fitting performed in 192.92 seconds.

Principal Component Analysis (PCA) finds linear combinations of the markers that capture as much of the variability in the dataset as possible, then projects the multi-dimensional data onto the axes defined by those linear combinations.
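As a rough illustration of what "a linear combination that captures the most variability" means, the sketch below finds the first principal component of a tiny dataset by power iteration on the covariance matrix. It is a teaching sketch under stated assumptions, not the PCA routine used for this report: the function name, the data, and the fixed starting vector are hypothetical, and real analyses would use a library implementation.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    Assumes the data are not constant and that the all-ones starting
    vector is not orthogonal to the leading eigenvector.
    """
    n = len(rows)
    d = len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project the centered data onto the component to get one score per row.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical two-marker data lying exactly on the line y = 2x.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, scores = first_principal_component(rows)
print(direction)
print(scores)
```

Because the example points lie on a single line, the first component points along that line and carries all of the variance.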

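t-distributed stochastic neighbor embedding (t-SNE) is a non-linear dimensionality reduction method. The log above reports the settings used for this run: 3 output dimensions, a perplexity of 30, and theta = 0.5.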

Clustering Analysis

Clustering is performed with mini-batch K-means. The resulting clusters can be visualized on dimensionality reduction plots such as PCA, t-SNE, or UMAP.
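The general idea of mini-batch K-means is to update the cluster centers from small random batches instead of the full dataset on every pass. The sketch below is a simplified plain-Python version of that idea, not the ClusterR implementation named in the warning that follows; the function names, the toy points, the batch size, and the explicit initial centers are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda k: squared_distance(point, centers[k]))


def mini_batch_kmeans(points, initial_centers, batch_size=4, iterations=100, seed=0):
    """Simplified mini-batch K-means with a per-center learning rate of 1/count."""
    rng = random.Random(seed)
    centers = [list(c) for c in initial_centers]
    counts = [0] * len(centers)
    for _ in range(iterations):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch against the current centers first.
        assignments = [nearest(p, centers) for p in batch]
        # Then move each assigned center a small step toward its point.
        for p, k in zip(batch, assignments):
            counts[k] += 1
            rate = 1.0 / counts[k]
            centers[k] = [(1 - rate) * m + rate * x for m, x in zip(centers[k], p)]
    labels = [nearest(p, centers) for p in points]
    return centers, labels


# Hypothetical 2-D points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = mini_batch_kmeans(points, initial_centers=[(0, 0), (10, 10)])
print(labels)
```

Passing the initial centers explicitly keeps the toy example deterministic; real implementations usually choose them with a seeding scheme, and the batches make the result depend on the random seed.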

## Warning: `predict_MBatchKMeans()` was deprecated in ClusterR 1.3.0.
## ℹ Beginning from version 1.4.0, if the fuzzy parameter is TRUE the function
##   'predict_MBatchKMeans' will return only the probabilities, whereas currently
##   it also returns the hard clusters
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## cluster_assignments
##     1     2     3     4     5     6     7     8     9    10 
##  3400 25837  7096  5278  3140 11454  7595 15597 21852  6622

Cluster Characterization

Further downstream analysis for cluster characterization includes an expression heatmap and a cluster proportion analysis.
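As a simple starting point for the proportion analysis, the cluster sizes printed in the clustering output above can be turned into overall proportions. The sketch below uses those ten counts as given; it only computes the share of all EVs per cluster, and any breakdown by metadata such as stage would need the per-EV assignments, which are not shown in this report.

```python
# Cluster sizes copied from the cluster_assignments table in the clustering output.
cluster_sizes = {
    1: 3400,
    2: 25837,
    3: 7096,
    4: 5278,
    5: 3140,
    6: 11454,
    7: 7595,
    8: 15597,
    9: 21852,
    10: 6622,
}

total = sum(cluster_sizes.values())
proportions = {cluster: n / total for cluster, n in cluster_sizes.items()}
largest = max(proportions, key=proportions.get)
smallest = min(proportions, key=proportions.get)

for cluster, share in proportions.items():
    print(f"cluster {cluster}: {share:.1%}")
print("total EVs:", total)
```

The total matches the number of EVs reported in the object summary, which is a quick check that every EV received a cluster assignment.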
